Kahle, Page 1


Interfaces for information access and retrieval are a long way from the ideal of the electronic book that you
can cuddle up with in bed. Nevertheless, today's interfaces are coming closer to supporting browsing,
selection, and retrieval of remote information by nontechnical users.

This paper describes five interfaces to distributed systems of servers that have been designed
and implemented: WAIStation for the Macintosh, XWAIS for X-Windows, GWAIS for Gnu-Emacs,
SWAIS for dumb terminals, and Rosebud for the Macintosh. These interfaces talk to one of two
server systems: the Wide Area Information Server (WAIS) system on the Internet, and the Rosebud
Server System, on an internal network at Apple Computer. Both server systems are built on
Z39.50, a standard protocol, and thus support access to a wide range of remote databases.

The interfaces described here reflect a variety of design constraints. Such constraints range from the
mundane (coping with dumb terminals and limited screen space) to the challenging. Among the challenges
addressed are how to provide passive alerts, how to make information easily scannable, and how to support
retrieval and browsing by nontechnical users. There are a variety of other issues which have received little or
no attention, including budgeting money for access to "for pay" databases, privacy, and how to assist users in
finding out which of a large (changing) set of databases holds relevant information. We hope that the
challenges we have identified, as well as the existence and public availability of source code for the WAIS
system, will serve as a stimulus for further design work on interfaces for information retrieval.

It requires little prescience to predict that one day computers will put an ocean of information at the fingertips
of a vast population of users. However, although there is a considerable amount of information available from
remote sources, the bulk of it is accessible only to information professionals, or users with technical
backgrounds. A variety of obstacles effectively block the ordinary user from accessing information via the
computer. These obstacles include the difficulty of locating appropriate information sources, the cumbersome
maneuvers needed to get online and to connect to remote sources, and cryptic query languages. Furthermore,
even if a user has succeeded in accessing a remote information source, it is likely that it will have its own
special purpose interface, which may or may not support the user's needs.

In this paper we describe two systems, Wide Area Information Servers (WAIS) and Rosebud, which provide
a protocol-based mechanism for accessing a variety of remote, full-text information servers. These systems
have the potential for supporting a single interface to a wide variety of information sources, and offer a good
platform on which to explore the design of interfaces for information retrieval. After a summary of existing
information retrieval systems, we describe the server systems, and then describe the five interfaces to them. In
the course of these descriptions we discuss design constraints, interface issues, and practical matters which
had an impact on the designs. We conclude with a summary, some remarks on important issues that have
not been addressed, and an invitation for other investigators to use the WAIS system as a platform for exploring
interfaces to multiple, remote information sources.

Existing Systems

While a review of all existing systems is beyond the scope of this paper, it is useful to list a number of the
most popular or significant interfaces for information retrieval.

Commercial interfaces for accessing full-text resources on computers can be broken down into dialup services,
local file access, and LAN-based access tools. Dialup systems such as Dialog and Dow Jones offer TTY
interfaces to users, with menus and command lines being the dominant access tools. Some dialup services are
offering client programs that run on personal computers to add graphical interfaces, such as "Navigator" by
CompuServe. In general, these interfaces are unique to the information provider. Local file access through
full-text indexing has been achieved in command line form (e.g., the Unix command "grep") and in
screen-based interfaces (e.g., ON Location (ON) and Digital Librarian (NeXT)). These interfaces often give
browsing and searching capabilities for local files. Some of these interfaces have been stretched to work with
files on file servers. LAN-based access tools usually use some sort of query language to access servers on the
net, such as Verity's Topic system (VERITY), and numerous library systems. These query languages require
some user training. Integrated tools for cross-platform, cross-vendor information access are not currently
available in other systems.

A variety of research projects have explored information retrieval systems. The SuperBook project (Egan,
1989) targets users of static information. Project Mercury (Ginther-Webster, 1990) is a remote library
searching system that uses a client-server model. Information Lens (Malone, 1986) is a structured e-mail
system for assisting in managing corporate information. NetLib for software (Dongarra, 1987) and Mosis for
information on how to fabricate chips (Mosis) are examples of email-based information retrieval systems.

The WAIS and Rosebud Projects

The two systems of information servers described in this paper grew out of two partially entwined projects:
WAIS and Rosebud. A goal of both projects was to define an open protocol that would allow any user
interface or information server that talked to the protocol to interact with any other component that used the
protocol. From the user's perspective, this would mean that user interfaces and information sources could be
mixed and matched, according to the user's needs.

WAIS started as a joint project between Thinking Machines Corporation, Apple Computer, Dow Jones & Co.,
and KPMG Peat Marwick (Kahle et al.). The proximate goal was to define the open protocol and
demonstrate its feasibility by implementing and demonstrating a multi-vendor system which provided ordinary
users with access to a variety of remote databases. Thinking Machines contributed its Connection
Machine-based retrieval technology, Apple contributed its expertise in user studies and interface design, and
Dow Jones & Co. provided access to its commercial information sources. KPMG Peat Marwick provided access
to its corporate data, and served as a site for user studies and testing. The WAIS system was installed at KPMG
Peat Marwick and enabled the designers to study the success of the system in a real world context. The WAIS
system uses pseudo-natural language queries, relevance feedback to refine queries, and accesses full-text,
unstructured information sources. These technologies were used because they had already been tested
independently, thereby leading to faster implementation of the complete system. The WAIS system will be
described in more detail in the next section.

During the same period, the Rosebud project was underway within Apple. Rosebud's goal was to serve as an
internal platform for research into system architecture and human interface issues and, as a consequence, it
employed a variety of more experimental technologies and was tested in-house. Like WAIS, Rosebud was
based on user studies conducted at KPMG Peat Marwick, and used the same underlying protocol, Z39.50. The
details of the Rosebud Server System will be described in a different paper.


[Figure 1 diagram: the WAIStation, XWAIS, GWAIS, SWAIS, and Rosebud interfaces; the Wide Area Information
Server (WAIS) System (Internet) and the Rosebud Server System (Apple Engineering Network); and the Z39.50
Protocol underlying both.]

Figure 1. The interfaces to the WAIS and Rosebud server systems, and the protocol.


After the collaborative phase of the WAIS project came the Internet experiment. In this phase of WAIS, source
code for the open protocol, information servers, and for several interfaces was made freely available over the
Internet. In addition, Thinking Machines established and maintained a directory of information servers that
WAIS users could query to find out about available information sources. This phase of WAIS is still in
progress, and has resulted in the creation of new interfaces, the availability over the Internet of more than
[illegible] hundred servers on three continents, and over 100,000 searches of the directory of servers. In the
first six months of the Internet experiment, approximately 4000 users from [illegible] countries have tried this
system, with no training other than documentation (Kahle et al., 1991). Administrators of popular information
servers indicate that they are getting over 50 accesses a day from many countries.


The WAIS System

WAIS employs a client-server model using a standard protocol (based on Z39.50) to allow users to find and
retrieve information from a large number of servers. The client program is the user interface, the server does
the indexing and retrieval of documents, and the protocol is used to transmit the queries and responses. Any
client which is capable of translating a user's request into the standard protocol can be used in the system.
Likewise, any server capable of answering a request encoded in the protocol can be used.

A WAIS server can be located anywhere that one's workstation has access: on the local machine, on a
network, or on the other end of a modem. The user's workstation keeps track of a variety of information about
each server. The public information about a server includes how to contact it, a description of the contents,
and the access cost.


The WAIS protocol (Davis et al., 1990) is an extension of the existing Z39.50 standard (NISO, 1988) from
NISO. It has been augmented where necessary to incorporate many of the needs of a full-text information
retrieval system. To allow future flexibility, the standard does not restrict the query language or the data
format of the information to be retrieved. Nonetheless, a query convention has been established for the
existing servers and clients. The resulting WAIS Protocol is general enough to be implemented on a variety of
communications systems.

The WAIS clients will be described in detail in the next several sections. However, all of them work in a
basically similar way. On the client side, queries are expressed as strings of words, often pseudo-natural
language questions. The client application then packages the query in the WAIS protocol and transmits it over
a network to one or more servers. The servers receive the transmission, translate the received packet into their
own query languages, and search for documents satisfying the query. The lists of relevant documents are then
encoded in the protocol and transmitted back to the client. The client decodes the response and displays the
results. The documents can then be retrieved from the server. The documents can be in any format that the
client can display, such as word processor files or pictures.
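
The request cycle described above can be sketched in a few lines of Python. This is a minimal illustration
under stated assumptions, not the WAIS implementation: the Server class, its word-overlap scoring, and the
JSON encoding that stands in for the Z39.50-based wire format are all hypothetical and chosen only to make
the steps visible (package the query, send it to several servers, let each server search in its own way,
return ranked lists, merge and display).

```python
import json
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical in-process stand-in for a remote information server."""
    name: str
    documents: dict = field(default_factory=dict)  # headline -> full text

    def handle(self, packet: bytes) -> bytes:
        # Decode the request, translate it into this server's own
        # "query language" (here: a set of lowercase words), and search.
        words = set(json.loads(packet)["query"].lower().split())
        hits = []
        for headline, text in self.documents.items():
            score = len(words & set(text.lower().split()))
            if score:
                hits.append({"headline": headline, "score": score,
                             "server": self.name})
        # Encode the list of matching documents for the trip back.
        return json.dumps(hits).encode()


def ask(query: str, servers: list) -> list:
    """Package a query, send it to every server, and merge the replies."""
    packet = json.dumps({"query": query}).encode()  # client-side encoding
    results = []
    for server in servers:
        results.extend(json.loads(server.handle(packet)))
    # Best matches first, as in the result list shown to the user.
    return sorted(results, key=lambda hit: hit["score"], reverse=True)
```

Because the client only sees encoded packets, any server that can answer them could be substituted, which is
the mix-and-match property the protocol is meant to provide.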

WAISTATION: AN INTERACTIVE QUERY INTERFACE

WAIStation At a Glance

Target Machine: Macintosh Plus and above, 9" monochrome screen.
Effort: 1 person-year
Number of Users: 2000
Status: Finished, freely distributed
Language: Think C
Communications: TCP/IP and modem (not supported)
Designer: Harry Morris
Organization: Thinking Machines
Availability: Available for anonymous FTP from /public/wais/WAIStation*.sit.hqx@think.com
Design Goals: Implementable quickly, support interactive queries well, changeable based on users'
    comments, make something very simple to learn (partner friendly). Try out many ideas: interactive
    queries, passive alerting, asking multiple servers.
Used: In a study with accountants and tax consultants at KPMG: very good user acceptance. In the Internet
    experiment: estimated that half of the users of WAIS are using WAIStation. (Based on when the directory
    of servers did not work for Macintoshes, usage dropped to half.)
Problems: Dealing with the directory of servers. Modem code was difficult to get right.

WAIStation was designed for use in the WAIS experiment at KPMG Peat Marwick. As such, we needed an
interface that would be easy to use, and would encourage successful searches by users untrained in search
techniques. Peat Marwick often sends its employees into the field toting their Macintosh SEs along for use as
portable computers. Thus we had to design the interface to run on a 9-inch black-and-white screen, and make
minimal demands on CPU and memory. Furthermore, WAIStation was designed for use over modems and
slow LANs.

Design Rationale

In designing WAIStation, we were informed by two metaphors: search as conversation, and storage by file
folder. The process of formulating an effective search is highly interactive. Of the documents which match a
query, the ones which match "best" are displayed. One or more may be of interest, in which case they can be
fed back to the system, interactively improving the search. We choose to view this process as a
conversation. Thus the initial natural language question becomes the starting point for give and take between
the user and the server(s). Relevance feedback provides the context for the question. As the search
proceeds, some results may suggest alternative searches or branches of the conversation. This is provided for
by allowing several questions to evolve at the same time.

Eventually one or more questions may be refined to the point where they are finding consistently good results.
At this point, the question can be automated, becoming a dynamically updated file folder. At intervals these
questions wake up and query their servers. The results are stored in the results field for later inspection. They
can be thought of as regular Macintosh folders, augmented with a charter describing how to keep
their contents up to date.
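
One way to picture a question as a "dynamically updated file folder" is as a record that carries its own
charter. The sketch below is illustrative only: the Question class, its field names, the interval expressed in
seconds, and the search callable are assumptions for the example and do not describe WAIStation's internal
data structures.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Question:
    """Hypothetical model of a stored question and its charter."""
    keywords: str
    sources: List[str]
    relevant_docs: List[str] = field(default_factory=list)  # feedback context
    results: List[str] = field(default_factory=list)
    update_interval: float = 24 * 3600.0  # seconds between automatic runs
    last_run: float = float("-inf")       # never run yet

    def is_due(self, now: float) -> bool:
        return now - self.last_run >= self.update_interval

    def wake_up(self, now: float,
                search: Callable[[str, str, List[str]], List[str]]) -> bool:
        """Re-run the question against each source if its interval has passed."""
        if not self.is_due(now):
            return False
        found = []
        for source in self.sources:
            found.extend(search(source, self.keywords, self.relevant_docs))
        self.results = found  # stored for later inspection
        self.last_run = now
        return True
```

A background task would call wake_up periodically with the current time; the folder metaphor corresponds to
the results list being refreshed without the user asking again.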


This parallel with the Macintosh folder structure suggested a drag and drop construction for the user interface
itself. Constructing a question is a three step process: typing the key words, specifying the servers to use, and
specifying the relevant documents to feed back. If we think of questions like Macintosh folders, we can use
the Macintosh's drag-and-drop mechanism for putting sources and relevant documents into a question. This
approach makes WAIStation's mechanics instantly familiar to users of the Macintosh Finder.


User Interface

When WAIStation starts up, two windows appear: one contains the user's available Sources (see below) and
one contains the user's saved Questions. Sources are identified by an eye icon, questions by a question mark
icon.

Double clicking on a question icon opens the stored question, including any new results found since the last
time it was examined. The top half of the question window contains a field in which to type key words (the
natural language part of the question), a list of relevant documents, a list of sources, and a list of result
headlines. Sources can be added to the question by selecting a source icon (in the Sources window), and
dragging it into the question. Relevant documents are specified in the same way.

Result documents, returned by the servers, can be examined by double clicking on their icon. Note that the
result list contains a graphical indication of how well each document matches the query. The original graphic
was a series of up to 4 stars, similar to the ratings found in TV Guide. We thought that this rating scheme
would be easily recognized. Experience proved that the stars did not provide enough information to be
recognized or to discriminate among the documents. Later versions of the software replaced the stars with a
horizontal bar giving 20 levels of resolution.
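
The difference between the two displays is one of quantization. The following sketch assumes, purely for
illustration, that a document's match score is normalized to the range 0.0 to 1.0 (the text does not say how
scores are represented); the function names are hypothetical.

```python
def to_levels(score: float, levels: int) -> int:
    """Quantize a score in [0.0, 1.0] to an integer in 0..levels."""
    clamped = min(max(score, 0.0), 1.0)
    return round(clamped * levels)


def bar(score: float, width: int = 20) -> str:
    """Render a score as a fixed-width text bar."""
    filled = to_levels(score, width)
    return "#" * filled + "." * (width - filled)
```

Under this assumption, two documents scoring 0.80 and 0.85 both map to 3 on a 4-level scale, but to 16 and
17 on a 20-level scale, so the bar can separate results that the stars would show as identical.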

Any of the resulting documents can be opened and viewed in its own window. WAIStation supports plain
ASCII documents as well as PICT format pictures. Text windows automatically scroll to the position which
the server considers the most relevant part of the document. This allows the user to quickly determine if a file
is useful. In order to perform well over slow communications channels (modems and slow LANs) the text is
downloaded on demand in 15 line chunks. The keywords used in the query are automatically highlighted in
bold.
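
A minimal sketch of on-demand chunked download and keyword highlighting follows. Only the 15-line chunk
size comes from the text; the LazyDocument class, the fetch callable (which stands in for a request to the
server for a range of lines), and the use of ** markers in place of bold are assumptions made for the example.

```python
import re

CHUNK_LINES = 15  # chunk size reported in the text


def chunk_range(line_no: int, chunk_lines: int = CHUNK_LINES) -> tuple:
    """Return (first, last) zero-based line numbers of the chunk holding line_no."""
    first = (line_no // chunk_lines) * chunk_lines
    return first, first + chunk_lines - 1


class LazyDocument:
    """Fetches a document chunk by chunk, only when a line is viewed."""

    def __init__(self, fetch, chunk_lines: int = CHUNK_LINES):
        self.fetch = fetch  # hypothetical callable: (first, last) -> list of lines
        self.chunk_lines = chunk_lines
        self.cache = {}     # first line of chunk -> list of lines

    def line(self, line_no: int) -> str:
        first, last = chunk_range(line_no, self.chunk_lines)
        if first not in self.cache:
            self.cache[first] = self.fetch(first, last)
        return self.cache[first][line_no - first]


def highlight(text: str, keywords: list) -> str:
    """Mark whole-word keyword matches; ** stands in for bold."""
    if not keywords:
        return text
    pattern = r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b"
    return re.sub(pattern, r"**\1**", text, flags=re.IGNORECASE)
```

If the server reports that the most relevant passage starts at line 47, the client needs only the chunk
covering lines 45 to 59 before it can show something useful; the rest is fetched as the user scrolls.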

Sources are specially formatted text files which describe information servers and how to get to them. Double
clicking on a source displays a window with several controls. The top part is information specified by the
server itself:

•  a pop-up menu to specify the method of contacting the server (IP address/TCP port, modem number
   and speed, or location of a local index);
•  a script to run after logging in (for use by modems);
•  a database to search (servers can support multiple databases);
•  a display of when the server is updated and how much it costs to search; and
•  a textual description of the database's contents.

The bottom half of the source window allows the user to specify personal information about the server:

•  when to contact it (for automatic update);
•  when it was last contacted;
•  how much to spend on it;
•  how much credence its results should be given (this is used to scale document scores, which helps
   in the sorting of responses to questions asked of multiple servers);
•  the number of documents to ask for when searching it; and
•  the font and type size to use when displaying plain text results (important to publishers).

Several of these fields are merely placeholders in the current implementation. In particular, budget and
confidence have not been implemented yet since there are no for-pay servers yet, and the number of sources is
still relatively small.
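
Since the text notes that credence is not yet implemented, the following is only a sketch of how such a field
could be used to merge results from several servers. The Source record, its field names and defaults, and the
example host names are hypothetical; the idea taken from the text is simply that each server's raw scores are
scaled by a per-source weight before the combined list is sorted.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """Hypothetical record mirroring some fields of a source window."""
    # Specified by the server
    name: str
    contact: str             # e.g. "host:port", a modem number, or a local index path
    database: str
    description: str = ""
    cost_per_search: float = 0.0
    # Specified by the user
    credence: float = 1.0    # weight applied to this server's scores
    max_documents: int = 40  # assumed default, not from the text


def merge_results(results_by_source: list) -> list:
    """Merge (source, [(headline, raw_score), ...]) pairs into one ranked list."""
    merged = []
    for source, hits in results_by_source:
        for headline, raw_score in hits[: source.max_documents]:
            merged.append((raw_score * source.credence, headline, source.name))
    merged.sort(key=lambda item: item[0], reverse=True)
    return merged
```

With a credence of 0.5, a server's top-scoring document (raw score 1.0) ranks below a 0.6 result from a
fully trusted server, which is the kind of cross-server sorting the field is intended to support.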

Source files can also be retrieved from servers. This allows users to search servers whose database elements
are pointers to other servers. The results can be used as targets for further searches. An experimental directory
of servers is being maintained on the Internet.


Implementation

WAIStation was implemented in Think C 4.0 using the object oriented class library. It took about a man-year
of effort. The most difficult parts were the automatic update facility and the communications. Automatic
update required the ability to do background processing, which is not a normal part of the Macintosh
operating system. Communications were difficult primarily because we were simultaneously debugging the
Z39.50 protocol, modem code, and the (then new) Apple Communications Toolbox. We eventually left
modems unsupported, and replaced the Communications Toolbox with direct calls to MacTCP. Through this
experience we found that communications speeds of less than 9600 baud were barely tolerable for interactive
text retrieval.

Observations

We estimate that WAIStation is now in use by over 2000 users in 20 countries. The common user complaints
center around configuring MacTCP, using the (undocumented) directory of servers, and avoiding a bug
requiring the software to be installed on the start-up disk.

We have noticed several shortcomings in the current design:

•  Users want access to their own data. WAIStation is capable of searching a Macintosh-based
   inverted index file, but we unbundled the index builder when we realized how much work it
   would take to make it useful under Macintosh OS. OnLocation (On Technology) is an
   implementation of a Macintosh indexer that could be used.

•  Interaction with the directory of servers is incomplete. It is not obvious which search results
   are source files, and what to do with the ones that are. It should be possible to drag a
   retrieved source directly into a question's source window, but the present interface requires
   that it be saved first. The lesson we learned was that special cases should be handled
   specially, rather than forcing users to use general techniques "for consistency's sake."

•  Printing documents and searching for keywords in documents (find/find-next) are simple
   functions which users expect.

•  People want to see their documents in their original form. WAIStation currently only displays
   ASCII and PICT. This can be fixed with format filters such as Claris' XTND, at the expense
   of the ability to download arbitrary sections of a document, since such filters require that the
   document be processed from the beginning.

•  Relevance feedback was not obvious. Users unfamiliar with the use of relevance feedback
   did not think to use it; it needs to be made more automatic. One way to do this might be to
   extend the notion that a question is a conversation, with relevance feedback as context (or
   body language): clients or servers can be written that watch their users, and deduce which
   documents were relevant based on which ones were read. A simpler approach might be to
   always do relevance feedback, presenting the results in a "see also" list. We tried this, but
   the Macintosh was too slow to make it useful.

•  Communications over 2400-baud modems are too slow to support interactive queries. We
   found that 9600 baud is barely acceptable, while 56Kb is sufficient to support several users.

•  The Finder-like interface (drag and drop) is not obvious. Even though the Macintosh Finder is
   based on drag and drop, no one expected it in an application. Once users were shown what
   to do, it became natural. It was also not necessarily the best use of screen space, since it
   required that both the start and end of the drag be visible on the screen at the same time.
   Another anomaly worth mentioning is that although we were simulating the Finder,
   we had no "trash can" analogy. Removing a source was accomplished by dragging it onto
   the desktop and dropping it there, which confused some users.


•  The alerting system was crude. For example, there was no visual cue to tell the user that a
   question had found new documents in the background. Also, the background searches did not
   exclude previously read documents.

•  Headlines often do not give enough context. The headlines displayed in the question window
   were only about 60 characters long, making it difficult to identify which documents were
   useful without opening them. Furthermore, there was no provision to display the document's
   date or the name of the source it came from.
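
The observation in the list above about modem speeds can be checked with rough arithmetic. The sketch below
makes several assumptions that are not in the text: that baud can be treated as bits per second, that each
character costs 10 bits on the wire (8 data bits plus start and stop bits), that a line is 80 characters
wide, and that latency and protocol overhead are ignored. Only the 15-line chunk size and the three line
speeds come from the text.

```python
def transfer_seconds(num_bytes: int, bits_per_second: int,
                     bits_per_byte: int = 10) -> float:
    """Time to send num_bytes over a serial link, ignoring latency and overhead."""
    return num_bytes * bits_per_byte / bits_per_second


CHUNK_BYTES = 15 * 80  # one 15-line chunk of 80-character lines (assumed width)

for rate in (2400, 9600, 56000):
    seconds = transfer_seconds(CHUNK_BYTES, rate)
    print(f"{rate:>6} bit/s: {seconds:.2f} s per chunk")
```

Under these assumptions a single chunk takes about 5 seconds at 2400, 1.25 seconds at 9600, and roughly
0.2 seconds at 56K, which is in line with the judgment that 2400 baud is too slow for interactive use and
9600 baud is barely acceptable.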




[Figure 2 screenshot: the Sources window (entries include Encyclopedia, Macintosh Hard Disk, and World
Factbook), the Questions window listing saved questions, and an open question window with "Look for
documents about", "Which are similar to", and "In these sources" fields.]

Figure 2. WAIStation's Sources and Questions windows store the user's personal objects. Dragging a
source into a question window specifies that the question will contact the source in order to
fulfill its charter.




[Figure 3 screenshot: a question window asking for documents about "recent developments in personal
computers", its scrolling list of starred result headlines, and an open document window showing a news
article on pen-based "note-pad computers".]

Figure 3. After running the question, results are displayed in a scrolling list. Double clicking on a
result opens a document window. Query words are highlighted.

Figure 4. Double clicking on a source icon opens a source window.

X WINDOWS BASED INTERFACE FOR WAIS: XWAIS

XWAIS At a Glance

Target Machine: X-Windows terminals on Unix machines
Effort: 4 person-months
Number of Users: 500
Status: Finished, freely distributed
Language: C
Communications: TCP/IP
Designer: Jonathan Goldman
Organization: Thinking Machines
Availability: Available for anonymous FTP from /public/wais/wais*.tar.Z@think.com
Design Goals: Copy WAIStation so that we can leverage one design; portable, and based only on freeware.
    Display data in many different formats (image, text, etc.).
Used: Used in the Internet experiment. Heavy use by X users within Thinking Machines and outside.
Problems: Installing it has caused many users to stumble. The number of variables (architectures, X
    directory structures) makes it difficult to make it portable. Touch on the ability to handle different
    types (this is unique to this interface). Uses other programs to help (like interapplication
    communication).

The WAIS interface for the X-Windows environment was developed for the Internet experiment to provide an
X-Windows-based interface for a growing community. It was built to look as much like the Macintosh WAIS
interface (WAIStation) as possible, given the limitations of the freely distributed X-Windows software. Since
the metaphors in XWAIS are nearly the same as those for WAIStation, a user of one system can easily move
to the other, without having to learn much new. In fact, the underlying data structures are identical to those in
WAIStation, so questions can be copied from a Macintosh to a UNIX machine running XWAIS, and used
without modification.

XWAIS supports interactive WAIS access, including question entry, source selection, and the
addition of relevant documents, or pieces of documents, to a question. Unlike WAIStation, XWAIS
retrieves an entire document when requested, instead of just the parts being viewed. We decided
this was acceptable, since the underlying networks for X will most likely be fast.

Since XWAIS runs under X-Windows, and was built for the UNIX operating system, it can take
advantage of the tools available for these systems to display a wide range of document formats.
A simple filter interface is provided in the application (as an X resource) to allow a user to
select the tool required for a given type of document; e.g., if the document is a PostScript
file, xps can be used to view it. This is a feature that is not available in any of the other
user interfaces described here.
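
The filter mechanism can be pictured as a lookup table from document type to viewer command. The sketch below is illustrative only: it is written in Python rather than as an X resource, and the type names, the table `VIEWERS`, and the function `viewer_command` are hypothetical. Only the PostScript/xps pairing comes from the text.

```python
# Hypothetical table of document types and the external tools used to view them.
# Only the PostScript -> xps pairing is taken from the text; the rest is assumed.
VIEWERS = {
    "PS": ["xps"],
    "GIF": ["xv"],  # assumption: some image viewer chosen by the user
}


def viewer_command(doc_type, path):
    """Return the command line for viewing `path`, or None to show it as plain text."""
    tool = VIEWERS.get(doc_type.upper())
    if tool is None:
        return None  # no filter configured: fall back to the built-in text display
    return tool + [path]


if __name__ == "__main__":
    print(viewer_command("ps", "/tmp/report.ps"))
    print(viewer_command("TEXT", "/tmp/notes.txt"))
```

Keeping the table outside the program logic is what lets a user add a new document type without rebuilding the client, which is presumably why the original exposes it as a resource.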


In order to distribute this software without restriction, XWAIS uses the freely distributed
Athena Widget set included in the X11R4 release from MIT. Although these widgets don't appear as
attractive as some others that are available, they can be used to build a useful interface. Some
aspects of this interface are restricted by the nature of the widgets available. XWAIS was built
using the Xt X Toolkit Intrinsics, and allows a large amount of customization of the appearance
of the display using X resources. The application relies heavily on the Xt resource mechanism,
and will not run unless these resources are in place. The "object-oriented" feel of these widgets
made building the interface rather easy, once the widget with the closest desired functionality
was found. Finding the correct widget was the hardest part. Most of the actual behavior of the
interface is controlled by "call-backs," the methods that widgets inherit.

The XWAIS application is actually two separate applications: XWAIS, a simple shell for selecting
sources and questions, and xwaisq, the application that actually performs WAIS transactions. The
C code in xwaisq is also used in waisq, the shell-support program for GNU Emacs WAIS. This allows
users to use simple UNIX facilities to submit questions created by xwaisq using waisq (e.g., a
crontab entry to periodically query a server).


The implementation for XWAIS was done in C (6K lines), using the X11R4 release of X-Windows from
MIT, the Xt X Toolkit Intrinsics, and the Athena Widget Set, included in the X-Windows release.

XWAIS is a text-based user interface built in a graphical window environment. Some additional
graphical metaphors would be desirable, but the limited widget sets precluded that. It would take
a considerably larger amount of work to add much graphics to this application. Perhaps some other
X toolkit would provide simpler methods for doing this.

Figure 6      The XWAIS interface, including the Questions and Sources windows, and an open question.

Figure 7      A document displayed in the XWAIS interface.


GNU EMACS WAIS INTERFACE: GWAIS

GWAIS At a Glance

Target Machine    Terminals on Unix machines
Effort            2 person-months
Number of Users   500
Status            Finished, freely distributed
Language          Gnu-Lisp and C
Communications    TCP/IP
Designer          Jonathan Goldman
Organization      Thinking Machines
Availability      Available via anonymous FTP from /public/wais/wais*.tar.Z@think.com
Design Goals      Copy WAIStation so that we can leverage one design. Use precedent from other
                  gnu-emacs applications: RMAIL, dired.
Used              Used in the Internet experiment, with heavy use by some gnu-emacs users.
Problems          Dealing with the directory of servers. Using passive alerting.

The WAIS interface on GNU-Emacs/Unix (GWAIS) was developed specifically for the Internet
experiment for a technically strong user population. It was developed for several reasons: the
large number of Emacs users, the extensibility of Emacs, the ubiquitous nature of character
display terminals, and the component nature of Emacs, which meant WAIS could be integrated into
email, bboards, and programming tools.

The design of the interface was a cross between WAIStation and other Emacs interfaces. The direct
manipulation of WAIStation was replaced by command keys, as is common in Emacs applications. The
choice of command keys was modeled on the dired and RMAIL Emacs applications.

GWAIS allows users to access the interactive features of WAIS: question entry, relevance
feedback, document display, and source selection. An extra feature, not found in the other
interfaces, is an interface to an indexer for creating sources, but it appears that this feature
is not heavily used. Furthermore, it allows questions to be saved, but it depends on the user to
automate the update of questions and sources using cron or other Unix tools. Graphic documents
can be displayed on X-Windows terminals if the user has set up the environment variables.

The implementation of GWAIS was in Emacs Lisp (2K lines) and in C code (3K lines). About half of
the time of a typical search and retrieval is spent in reading the data into Lisp.


SCREEN-BASED (TERMINAL) WAIS INTERFACE: SWAIS

SWAIS At a Glance

Target Machine    Terminals connected to Unix systems
Effort            1 person-month
Number of Users   900
Status            Beta
Language          C
Communications    TCP/IP
Designer          John Curran
Organization      NSF Network Service Center
Availability      To be included in WAIS release, anonymous FTP from
                  /public/wais/wais*.tar.Z@think.com
Design Goals      Highly portable, provide straightforward user interface, utilize existing
                  application key mappings (rn, vi, Emacs), support multiple servers per query,
                  allow for personal "source" directory and a common source directory, allow for
                  useful source discovery via searches, provide simple active tool with little
                  state (no question storage, relevance feedback, or passive notification).
Used              Internet users via Telnet: K-12 students, educators, user services staff,
                  librarians, and (occasionally) network staff.
Problems          Dealing with the directory of servers. Lack of information in many
                  server-returned records. Providing simple and uniform nomenclature. Planning
                  for large numbers of sources.

To open WAIS to a wider community of users, an interface was developed to run on dumb terminals
or over Telnet sessions. It is called "SWAIS," for Screen WAIS, since it uses a character display
terminal screen for the interface. The user communities that this interface can serve are dial-in
users, Telnet users, and low-end terminal users.

The design of the interface involved three screens: a single screen listing all known servers
that the user could pick from; a list of search result document headlines; and a document display
screen. Listing all servers and allowing users to select which servers to use encourages users to
ask questions of multiple servers. Unlike the other interfaces, the sources list shows what site
runs each source and how much it costs (if anything). The resulting document screen includes
headlines and number of lines, but its innovation is to show the source.

It does not handle relevance feedback or downloading new sources from the directory of servers.
Another drawback is using it with large numbers of sources, since moving around the list requires
scrolling. On the other hand, this interface has proven to be very popular on the Internet
because of its ease of use: all a user has to do is Telnet to a specific machine to use it.
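
The idea of asking several sources the same question and showing which source each headline came from can be sketched as follows. This is a minimal illustration in Python, not the SWAIS code (which is in C): the `Source` and `Hit` records, the numeric score, and the `search_one` callback that stands in for a real WAIS transaction are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str   # e.g. "poetry"
    site: str   # the site that runs the server
    cost: str   # what access costs, e.g. "free"


@dataclass
class Hit:
    score: int      # assumed relevance score returned by the server
    source: str     # name of the source that returned the document
    headline: str
    lines: int


def search_all(sources, question, search_one):
    """Ask every selected source the same question and merge the answers.

    `search_one(source, question)` is a stand-in for the real transaction and
    must return (score, headline, lines) tuples.
    """
    hits = []
    for src in sources:
        for score, headline, lines in search_one(src, question):
            hits.append(Hit(score, src.name, headline, lines))
    # Best matches first; ties keep the order in which the sources were asked.
    hits.sort(key=lambda h: h.score, reverse=True)
    return hits


def format_hit(hit):
    """One row of the result screen: score, source, headline, line count."""
    return f"{hit.score:4d}  ({hit.source})  {hit.headline}  [{hit.lines} lines]"


if __name__ == "__main__":
    canned = {"poetry": [(1000, "The Second Coming", 22)],
              "news": [(500, "Falcon found", 40)]}
    chosen = [Source("poetry", "site-a", "free"), Source("news", "site-b", "free")]
    for hit in search_all(chosen, "falcon", lambda s, q: canned[s.name]):
        print(format_hit(hit))
```

Tagging each hit with its source at merge time is what makes a combined result list readable once more than one server has been queried.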

Figure 9      The SWAIS query building screen. The poetry source is selected, and search terms are

[Screen image: the text of a poem (Yeats's "The Second Coming") displayed by SWAIS]

Figure 11     A document displayed in SWAIS.

THE ROSEBUD INTERFACE: REPORTERS AND NEWSPAPERS ON THE MACINTOSH

Rosebud At a Glance

Target Machine    Macintosh II, color screen
Number of Users   25
Status            Finished; internal use
Language          Smalltalk, MPW-C
Communications    TCP/IP using IPC package
Designers         Charlie Bedard, David Casseres, Steve Cisler, Tom Erickson, Ruth Ritter,
                  Eric Roth, Gitta Salomon, Kevin Tiene, Janet Vratny-Watts.
Organization      Apple Computer
Availability      Only internally to Apple ATG
Design Goals      Serve as research platform for interface and architectural explorations.
                  Allow ordinary users to create personalized information flows; support
                  passive alerting, scanning, and capture of information.
Used              Used in various internal tests; not available for the Internet experiment.
Problems          No good interface mechanisms for providing users with convenient access to
                  large numbers of servers.

Rosebud is a project within Apple Computer's Advanced Technology Group. Its principal objective
is to serve as a platform for investigations into what is needed to make remote information
accessible and useful to ordinary Macintosh users. The investigations have two foci: human
interface components and techniques, and system architecture issues. In this article we focus
exclusively on the human interface aspects of Rosebud.

The Rosebud Server System is similar to the WAIS system in that it uses the Z39.50 protocol to
access multiple, remote databases; it differs from WAIS in that it contains extra underpinnings
for making information access an integral part of the Macintosh environment. Specifically, the
Rosebud Server System allows users to create autonomous, ongoing "agent" processes which access,
update, and present information from local and remote sources. The Rosebud system does not
currently provide access to the Internet WAIS servers (for reasons of network security, rather
than basic incompatibilities), and is not publicly available.

Design  Rationale 

The design of the Rosebud interface began with a study of the practices and problems of ordinary
information users. The principal focus was on information users at KPMG Peat Marwick in San Jose,
the original client site for WAIS; in addition, several groups of users of online information
services within Apple were studied (Erickson & Salomon, 1991). Interviews with accountants at
Peat Marwick enabled the designers to put together a schematic of how information (mostly
paper-based information) flowed through their offices (Figure 12).

Figure 12     Information flow through accountants' offices.

Several features of this schematic informed the design of Rosebud. First, information typically
came to the accountants via newspapers, magazines, and memos; instances where the accountants
went out of their way to search for information were less frequent. Second, the accountants never
talked about "reading" information; they always spoke of scanning or skimming it, since they did
not have time to read it. This suggested that a good interface should provide a way for the users
to scan retrieved information quickly. Third, accountants remarked that they discarded most
information, including information that might be useful. Potentially useful information was
discarded for two reasons: the accountants did not have the physical space to store everything,
and they knew from experience that if they tried to save too much, they would not be able to find
anything later when they actually needed it. This suggested that giving users access to remote
information was just half the problem; users also needed tools for archiving, organizing, and
re-retrieving information. Finally, when users did come across information that seemed worth
saving, they typically would cut it out (the accountants used paper-based information almost
exclusively), and then they would annotate it by circling, underlining, or jotting a few notes in
the margin. Annotation turned out to be an important concept: not only did it help the user who
annotated when the information was re-retrieved later, but it also helped others scan the
information more quickly when copies were passed on to them.

The consequence of these observations was a design for a system which allowed users to define
topics of interest which would be retrieved automatically, and would then permit them to scan
those items and save them into an environment where they could be annotated, organized, and
re-retrieved.


Human  Interface 

The Rosebud interface design has three components: reporters, newspapers, and notebooks.
Reporters are for retrieving information. Users give reporters assignments which specify what to
look for, and where to look. This is shown in Figure 13: users enter words describing the
information in which they are interested, check off the information sources they wish the
reporter to search, and, if they so choose, automate the reporter so that it searches the
databases on a daily or weekly basis. When the user presses the "Search" button in the assignment
window, a reporter is created, performs the search, and returns with a list of results
(Figure 14).

Figure 13     Creating a reporter: the assignment window.

The reporter window (Figure 14) provides users with a variety of ways to look over their results
and refine their queries. The results are shown in the "Best Guesses" pane. (The name "Best
Guesses" was chosen to provide some indication that inaccuracy could be expected; our
observations of users had shown that they were often mystified by some of the items that showed
up as the results of searches.) The asterisks to the left of items indicate their relative
relevance, and the pop-up menu above the pane allows users to order the list by date or
relevance. Simply selecting an item shows a preview of it: a short excerpt with search terms
highlighted in color and boldface (Figure 15). Previews are useful because users can get a look
at a small part of the item without incurring the overhead of downloading the whole article over
the network. Users also have the options of saving articles to their disks or opening them for
viewing. Finally, having looked over their results, users can refine their search in the bottom
pane of the window.
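
The "Best Guesses" pane combines two simple ideas: asterisks scaled to relative relevance, and a list that can be ordered by date or by relevance. The Python sketch below illustrates both. It is not the Rosebud implementation (which was written in Smalltalk), and the four-asterisk scale, the numeric scores, and the function names are assumptions made for the example.

```python
from datetime import date


def stars(score, best, max_stars=4):
    """Scale a raw score against the best score in the list to 1..max_stars asterisks."""
    if best <= 0:
        return "*"
    n = round(max_stars * score / best)
    return "*" * max(1, min(max_stars, n))


def best_guesses(items, order="relevance"):
    """items: list of (score, date, title). Return display rows in the requested order."""
    if not items:
        return []
    best = max(score for score, _, _ in items)
    if order == "date":
        ordered = sorted(items, key=lambda it: it[1], reverse=True)  # newest first
    else:
        ordered = sorted(items, key=lambda it: it[0], reverse=True)  # most relevant first
    return [f"{stars(s, best):<4} {d.isoformat()}  {t}" for s, d, t in ordered]


if __name__ == "__main__":
    results = [(500, date(1991, 5, 2), "Trade talks"),
               (1000, date(1991, 4, 1), "Border dispute")]
    for row in best_guesses(results, order="date"):
        print(row)
```

Because the asterisks are relative to the best item in the list, they say nothing about absolute quality, which is consistent with the cautious "Best Guesses" label.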

Figure 15     The reporter window makes it easy to scan through hits. Clicking on a retrieved
              item generates a preview which shows an excerpt of the hit with the search terms
              (Tibet and China) highlighted in boldface. The user can refine the query in the
              lower pane of the window.
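
A preview of this kind can be approximated by taking a few words on either side of the first search term and marking every term that falls inside that window. The following Python sketch is an illustration under stated assumptions, not the Rosebud code: the `preview` function and its `window` parameter are hypothetical, and `**` markers stand in for boldface on a plain-text display.

```python
def preview(text, terms, window=6):
    """Return an excerpt around the first search term, with each term marked by ** **."""
    wanted = {t.lower() for t in terms}
    words = text.split()

    def is_hit(word):
        # Compare without surrounding punctuation, ignoring case.
        return word.strip(".,;:!?\"'()").lower() in wanted

    # Index of the first matching word; fall back to the start of the text.
    first = next((i for i, w in enumerate(words) if is_hit(w)), 0)
    start = max(0, first - window)
    chunk = words[start:first + window + 1]
    return " ".join(f"**{w}**" if is_hit(w) else w for w in chunk)


if __name__ == "__main__":
    article = "Talks between Tibet and China resumed today after a long pause."
    print(preview(article, ["tibet", "china"], window=2))
```

Only the excerpt has to be sent to the client, which is the point made in the text: the user can judge an item without paying for the whole article to cross the network.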

The above sequence occurs whenever a user creates a new reporter. However, since users are likely
to use many reporters, and because the initial user studies indicated that ways of skimming
through incoming information were important to the accountants, the newspaper was provided to
support rapid scanning of new information. The model of a newspaper is quite simple (Figure 16):
on the left is an index column which contains the names of all reporters, and to the right are
two columns of news. Each reporter "owns" one news column and publishes the title, date, and an
excerpt of each item in its column. The columns scroll independently, using "minimalist" scroll
bars to prevent the multiple scroll bars from visually overloading the screen. If an excerpt
seems interesting, double clicking on it opens the full article in a window, from which it can be
viewed, printed, or saved. Thus, rather than having to open up a dozen reporters every morning to
see what's new, the user can go to one place, the newspaper.

Figure 16     The newspaper allows users to quickly scan through new items retrieved by the
              reporters which are working automatically.

The newspaper can also serve as a control center for the Rosebud interface. The user can open a
reporter by clicking on its name or icon at the top of its news column. Consequently, if a
reporter's column has strayed from the desired topic, the user can quickly get to the reporter
and revise its assignment. The index also lists inactive reporters (those either not automated,
or that have not found anything new since the last newspaper), so they too can be opened, and
automated or otherwise adjusted.
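
The newspaper model described above (an index of all reporters, plus one news column per reporter that has something new) can be sketched as a small data structure. This Python sketch is illustrative only: the `Item` and `Reporter` records and the `build_newspaper` function are hypothetical names, and the layout into exactly two on-screen columns is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    title: str
    date: str      # ISO date string, e.g. "1991-05-02"
    excerpt: str


@dataclass
class Reporter:
    name: str
    automated: bool
    new_items: list = field(default_factory=list)  # found since the last newspaper

    @property
    def active(self):
        # Inactive reporters are those not automated, or with nothing new to report.
        return self.automated and bool(self.new_items)


def build_newspaper(reporters):
    """Return (index, columns): every reporter is indexed, only active ones get a column."""
    index = [(r.name, "active" if r.active else "inactive") for r in reporters]
    columns = {
        r.name: [f"{it.title} ({it.date}): {it.excerpt}" for it in r.new_items]
        for r in reporters if r.active
    }
    return index, columns


if __name__ == "__main__":
    staff = [
        Reporter("Tibet", True, [Item("Talks resume", "1991-05-02", "Talks between...")]),
        Reporter("Tax law", False),
        Reporter("Mergers", True),
    ]
    index, columns = build_newspaper(staff)
    print(index)
    print(columns)
```

Keeping inactive reporters in the index, even though they have no column, matches the role of the newspaper as a control center: every reporter remains one click away.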

A third component of the Rosebud interface, the notebook, was designed but not implemented.
Notebooks are environments within which users may save, annotate, and organize retrieved
information. Notebooks were designed in response to the observations of Peat Marwick accountants,
which indicated the need for an environment which supported the way accountants worked. In
particular, notebooks were intended to support annotation and re-finding of retrieved information
at a later date. A particularly nice feature of the notebook design was its use of annotations as
landmarks for re-finding information. The notebook design, and its rationale, is described in
Erickson & Salomon (1991).


Implementation

The Rosebud system consists of six parts: (1) a human interface application written in
Smalltalk/V (to facilitate the rapid changes in the interface necessary to effectively conduct
interface design research); (2) a search manager package which implements the autonomous agent
functionality and formulates Z39.50 queries for (3) remote Z39.50 servers implemented in MPW C
that automatically index items placed in their input folders by (4) HyperCard stacks that
download new items from a Net News server; (5) a file manager component (MPW C) that does all of
the file I/O and compaction for reporters and newspapers; and (6) directory servers which allow
the various components to find one another. All of these components are written as separate
applications and communicate with one another using a prototype IPC that runs over TCP/IP. The
file manager and search manager applications run in the background under MultiFinder, enabling
Rosebud to access information and construct newspapers while the human interface application is
not running. Like the other WAIS interfaces, Rosebud uses the WAIS protocol package. The human
interface was designed for Macintosh II class machines, with 13-inch color screens.


Observations and Testing Results

The Rosebud human interface was subjected to informal testing on 14 users. Users were told only
that Rosebud was an application for finding information, and then given a particular topic to
find information on. They were given no help or documentation. Note that although informal, this
type of testing is very stringent, in that users approach the application knowing almost nothing
about what it is, or why they would actually use it. Data collection consisted simply of
recording their questions, observations, and problems as they went along, administering a
post-test questionnaire, and then asking them a few open-ended questions. Here are a few of the
more general observations.

•  Over 80% of those who tried the Rosebud interface responded very positively to it, and said
   that they would use something with its capacities as part of their daily work routine. Two
   thirds of users indicated that they would usually use newspapers to browse through
   information (instead of reporters).

•  At the end of the test, over two-thirds of the users said they liked the metaphors of
   reporters and newspapers; however, almost all users had some difficulty in getting started.
   The typical problem was that users did not associate reporters with a way of retrieving
   information. When asked to find information, users first looked for an item called "search";
   when they did not find this, they usually turned to the newspaper, which is, in fact, where
   they look for information on a daily basis. It is possible that this problem can be remedied
   by minor interface changes (e.g., putting a "New Reporter" item in a search menu);
   alternatively, it may be that the metaphor is inappropriate.

•  A number of users were led astray because they had conceptual models of information
   retrieval based on their familiarity with query languages and structured databases. Such
   users tended to be wary of entering search terms because they were not sure of the
   appropriate syntax, and did not understand what "relevance" meant. Those who did know the
   meaning of relevance wanted to know how the information server calculated it.

•  Users liked previews, especially the feature of highlighting keywords in boldface. They
   wanted to see boldface keywords in the newspaper and article windows. Users also wanted the
   ability to select text in the newspaper and article windows and change the style or font
   themselves, so that they could annotate significant items. This parallels practices observed
   in our initial observations of accountants, where we found that annotation plays several
   important roles.

•  A variety of low-level interface problems, due to terminology or graphic design, were
   discovered. Some examples: users did not usually recognize the asterisks in the "Best
   Guesses" window as indicators of relevance; users did not think that "idle reporters" was a
   good name, and said that it was very important to distinguish between reporters that had
   found nothing, and those that were not looking.

The testing described above focused on how usable Rosebud was on users' first exposure. In the
next phase of testing, a small set of users will be observed over the course of a month, during
which they will have the option of using Rosebud from their desktop machines to access meaningful
data. This phase of testing will allow a more realistic assessment of Rosebud, in that it will
last long enough to permit users to build up their own set of reporters, and to access newspapers
that contain information of personal import.


CONCLUSION 

This article has described five interfaces developed to provide access to distributed systems of
information servers. The interfaces presented here were developed with different constraints in
mind, so it is not useful to compare them directly; instead, they may serve as examples of
differing responses to issues such as screen size, workstation power and intelligence,
communication speeds, and user needs and practices.

The interfaces designed so far have addressed some of the critical issues for end-users to
accomplish interactive searches in a wide-area network. These include ways of finding which
information servers contain relevant information; supporting searching by ordinary users; and
supporting browsing of, and passive alerting about, newly retrieved information. The alerting
aspects of the interfaces have not been tested much in this environment due to the lack of
appropriate data sources for this type of searching. It is probably fair to say that any of the
design solutions described here can be improved upon by further work.

The WAIS Internet experiment has revealed a number of issues requiring further work. In the
Internet environment we have observed (in the logs of user queries) that users have a difficult
time finding out what is in a database, thus demonstrating that there is a lack of browsing or
scanning facilities in the interfaces, protocol, and servers, as well as a general shortage of
descriptive information about databases.

Finally, a variety of other issues that have received little or no work were raised during the
studies of the Peat Marwick accountants. Document layout is one such problem. Accountants
mentioned that sometimes they want to retrieve documents not because of the information they
contain, but to look at their layouts (accountants will often examine successful proposals to a
client when preparing a new proposal). More generally, users regard pictures, diagrams, tables,
and charts as essential components of a document's content. Unfortunately, support for different
document formats, and for the retrieval and display of nontextual information within them, is
very limited on most existing clients.

Another issue is called the boilerplate problem. Accounting documents often contain a large
amount of boilerplate: standard text which varies little from document to document. What tools
are needed to allow users effectively to retrieve, order, and browse a large set of documents
which are 95% similar? Note that boilerplate is characteristic of a wide variety of business
proposals and legal documents, not just accounting documents. In fact, the analog to boilerplate
occurs in scientific documents in which standard terms and descriptions are used to describe
procedures and methods used in an investigation.

A number of other issues remain to be addressed. Users are very interested in being able to see
what queries other users are conducting, and what information servers and articles are most
popular. A frequent suggestion is to allow users to rate the "goodness" of articles they
retrieve. However, in a commercial setting, information about the kind of questions being posed
by a particular company or person can be revealing and valuable. Clearly, the utility that such
information could provide must be balanced by concerns about confidentiality and privacy, and
mechanisms for user control of descriptive information are essential. Other issues include how to
control the pricing, copyright, and distribution issues which accompany "for-pay" information.

In summary, there is an immense amount of work to be done. A central part of this work involves
further research and development of interfaces. We have made the WAIS system publicly available
in the hope that designers will find that it, with its common protocol and defined
infrastructure, can serve as a platform from which to pursue these, and other, research issues.


FOR MORE INFORMATION ON THE WAIS SYSTEM

The success of a distributed system of information servers depends on a critical mass of users
and information services. To encourage development and use, Thinking Machines is making the
source code for a WAIS protocol implementation freely available. While this software is available
at no cost, it comes with no support. We hope that it will help others develop servers and
clients.

For more information, please contact:

Barbara Lincoln Brooks (barbara@wais.com), WAIS Inc., 1040 Noel Drive, Menlo Park, CA 94025